A Minimal Route to Transformer Attention
neelsomaniblog.com·2d·
Discuss: Hacker News
🧩Attention Kernels
[D] Best (free) courses on neural networks
reddit.com·3h·
🧩Attention Kernels
Neural bases of sustained attention during naturalistic parent-infant interactions
nature.com·1d
🧩Attention Kernels
Everything About Transformers
krupadave.com·2d
🧩Attention Kernels
Your Transformer is Secretly an EOT Solver
elonlit.com·1d·
Discuss: Hacker News
Flash Attention
GIR-Bench: Versatile Benchmark for Generating Images with Reasoning
paperium.net·1d·
Discuss: DEV
🏎️TensorRT
Minimax pre-training lead explains why no linear attention
reddit.com·2d·
Discuss: r/LocalLLaMA
Flash Attention
Clarity From Chaos: AI Super-Resolution Redefined
dev.to·12h·
Discuss: DEV
Flash Attention
Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
arxiv.org·1d
🧮cuDNN
RF-DETR Under the Hood: The Insights of a Real-Time Transformer Detection
towardsdatascience.com·1d
🏎️TensorRT
Show HN: Hot or Slop – Visual Turing test on how well humans detect AI images
hotorslop.com·1d·
Discuss: Hacker News
Flash Attention
Dual-format attentional template during preparation in human visual cortex
elifesciences.org·3d
🧩Attention Kernels
An underqualified reading list about the transformer architecture
fvictorio.github.io·2d·
Discuss: Hacker News
🧩Attention Kernels
Breaking the Curse of Dimensionality: A Game-Changer for L
dev.to·1d·
Discuss: DEV
🧩Attention Kernels
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
arxiv.org·3d
🧩Attention Kernels
🧠 Soft Architecture (Part B): Emotional Timers and the Code of Care (Part 5 of the SaijinOS series)
dev.to·6h·
Discuss: DEV
🤖AI Coding Tools
After distractions, rotating brain waves may help thought circle back to the task
medicalxpress.com·1d
Flash Attention
Brumby-14B-Base: The Strongest Attention-Free Base Model
manifestai.com·2d·
Discuss: Hacker News
🏎️TensorRT
Emergent introspective awareness in large language models
transformer-circuits.pub·1d·
Discuss: Hacker News
Flash Attention